# Swin-BART architecture

Tags: OCR, DocVQA, Donut

Donut is an OCR-free document understanding Transformer model that combines a visual encoder with a text decoder for document visual question answering (DocVQA) tasks.

Donut is an OCR-free document understanding model built on a Swin Transformer visual encoder and a BART text decoder. This version is fine-tuned on the CORD receipt dataset.
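As a rough illustration of how the encoder-decoder pairing is used in practice, the sketch below runs a CORD-fine-tuned Donut model on a receipt image with the Hugging Face `transformers` library. The checkpoint name `naver-clova-ix/donut-base-finetuned-cord-v2` and the task prompt `<s_cord-v2>` are assumptions, since the listing does not name a specific checkpoint; `parse_receipt` and `strip_task_token` are hypothetical helper names. It also assumes `transformers`, `torch`, and `Pillow` are installed.

```python
import re

# Assumed checkpoint and task prompt; the listing does not name either.
CHECKPOINT = "naver-clova-ix/donut-base-finetuned-cord-v2"
TASK_PROMPT = "<s_cord-v2>"


def strip_task_token(sequence: str) -> str:
    """Drop the first <...> tag (the task start token) from a decoded sequence."""
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def parse_receipt(image_path: str) -> dict:
    """Encode one receipt image with Swin, decode with BART, return parsed fields."""
    # Imported lazily so strip_task_token works without the heavy dependencies.
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(CHECKPOINT)
    model = VisionEncoderDecoderModel.from_pretrained(CHECKPOINT)
    model.eval()

    # The processor resizes and normalizes the image for the visual encoder.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # The text decoder is conditioned on a task prompt instead of OCR output.
    decoder_input_ids = processor.tokenizer(
        TASK_PROMPT, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    outputs = model.generate(
        pixel_values,
        decoder_input_ids=decoder_input_ids,
        max_length=model.decoder.config.max_position_embeddings,
        pad_token_id=processor.tokenizer.pad_token_id,
        eos_token_id=processor.tokenizer.eos_token_id,
        bad_words_ids=[[processor.tokenizer.unk_token_id]],
        return_dict_in_generate=True,
    )

    # Remove padding/end tokens and the task token, then convert tags to a dict.
    tok = processor.tokenizer
    sequence = processor.batch_decode(outputs.sequences)[0]
    sequence = sequence.replace(tok.eos_token, "").replace(tok.pad_token, "")
    return processor.token2json(strip_task_token(sequence))
```

Calling `parse_receipt("receipt.png")` (a hypothetical file path) downloads the checkpoint on first use and should return a nested structure of receipt fields; the exact keys depend on the checkpoint and the image.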
